-
Notifications
You must be signed in to change notification settings - Fork 25.6k
[ESQL] Support first/last functions. #136419
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
- Change `AggregatorImplementor` to use `Argument` instead of `AggregationParameter`. - Compiling esql module resulted in only 2 modified `SpatialExtent*` classes
Pinging @elastic/es-analytical-engine (Team:Analytics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
if (invokeBlockEval.length == 1) { | ||
builder.addStatement(invokeBlockEval[0]); | ||
builder.endControlFlow(); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can remove this condition here and just do the other one always
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Just have the for loop code, and skip the length==1
check.
if (type.equals(BYTES_REF)) { | ||
params += ", " + name + "Scratch"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can reuse isBytesRef() and scratchName() here (?)
MethodSpec.Builder builder = initAddRaw(true, masked); | ||
if (aggParams.getFirst().isArray()) { | ||
if (aggParams.getFirst() instanceof BlockArgument) { | ||
builder.addComment("This type does not support vectors because all values are multi-valued"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not related with this PR, but I think this should be an assertion? If for any reason code goes through this, it feels like an error.
However, now that you're not calling it, maybe we shouldn't create this method to begin with, if it won't be usable. Can we skip its creation?
If it adds extra complexity, it can be left for another PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Btw, I don't know much about why these functions can't receive single-valued positions; @craigtaverner will know better if removing/asserting will change logics in any way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure I'm following the code here. For the spatial point types, we should get vectors of single valued fields. The new code seems to disable this. I think I would need to investigate further to understand why these changes have this effect, because at first glance the switch from isArray to BlockArgument seems correct, but is clearly not working as I'd expect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did a quick review, and it seems to be like something is not quite right. It looks like the vector optimization for spatial points has been disabled.
*/ | ||
void read(MethodSpec.Builder builder, boolean blockStyle); | ||
|
||
void read(MethodSpec.Builder builder, String accessor, String firstParam); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably needs javadoc like the adjacent methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, since all but one implementation is an empty implementation, we could use a default
implementation here in the interface?
builder.beginControlFlow("for (int i = 0; i < $LBlocks.length; i++)", name); | ||
builder.addStatement("$LVectors[i] = $LBlocks[i].asVector()", name, name); | ||
builder.beginControlFlow("if ($LVectors[i] == null)", name).addStatement(invokeBlockEval).endControlFlow(); | ||
builder.beginControlFlow("if ($LVectors[i] == null)", name).addStatement(invokeBlockEval[0]).endControlFlow(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we should assert that the array has exactly one element?
} | ||
|
||
@Override | ||
public void read(MethodSpec.Builder builder, String accessor, String firstParam) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be deleted if we make a default method in the interface.
if (invokeBlockEval.length == 1) { | ||
builder.addStatement(invokeBlockEval[0]); | ||
builder.endControlFlow(); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. Just have the for loop code, and skip the length==1
check.
builder.addStatement("$T $L = $L.asVector()", vectorType(p.type()), p.vectorName(), p.blockName()); | ||
builder.beginControlFlow("if ($L == null)", p.vectorName()); | ||
|
||
boolean isBlockArgument = aggParams.getFirst() instanceof BlockArgument; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I understand correctly, we're making the decision if the field is stored as a vector rather than a block up-front in the code generation, instead of at runtime in the evaluator? The consequences for the spatial extents function are that vector optimizations will never occur, and the assumption here is that they never can occur anyway. I'm not sure I quite follow that assumption.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I remember correctly, we use vector optimization if all the values in the segment are single-valued. And the evaluators that were changed here were spatial extent aggregation over geo_point and cartesian point, which are almost always single-valued in the data-sets I've worked with.
return; | ||
} | ||
addRawVector(valuesVector, mask); | ||
addRawBlock(valuesBlock, mask); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feels like we've disabled the optimization for single-valued points. I'd expect a possible performance regression here.
MethodSpec.Builder builder = initAddRaw(true, masked); | ||
if (aggParams.getFirst().isArray()) { | ||
if (aggParams.getFirst() instanceof BlockArgument) { | ||
builder.addComment("This type does not support vectors because all values are multi-valued"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not quite sure I'm following the code here. For the spatial point types, we should get vectors of single valued fields. The new code seems to disable this. I think I would need to investigate further to understand why these changes have this effect, because at first glance the switch from isArray to BlockArgument seems correct, but is clearly not working as I'd expect.
AggregatorImplementor
to useArgument
instead ofAggregationParameter
.SpatialExtent*
classes